ABSTRACT

Background: Using differential display (DD), we dis- 
covered a new member of the serine protease family of 
protein-cleaving enzymes, named protease M. The gene
is most closely related by sequence to the kallikreins, to 
prostate-specific antigen (PSA), and to trypsin. The diag- 
nostic use of PSA in prostate cancer suggested that a 
related molecule might be a predictor for breast or ovar- 
ian cancer. This, in turn, led to studies designed to char- 
acterize the protein and to screen for its expression in 
cancer. 

Materials and Methods: The isolation of protease M 
by DD, the cloning and sequencing of the cDNA, and the 
comparison of the predicted protein structure with re- 
lated proteins are described, as are methods to produce 
recombinant proteins and polyclonal antibody prepara- 
tions. Protease M expression was examined in mam- 
mary, prostate, and ovarian cancer, as well as normal, 
cells and tissues. Stable transfectants expressing the protease M
gene were produced in mammary carcinoma
cells. 

Results: Protease M was localized by fluorescent in situ 
hybridization analysis to chromosome 19q13.3, in a region
to which other kallikreins and PSA also map. The
gene is expressed in the primary mammary carcinoma 
lines tested but not in the corresponding cell lines of 
metastatic origin. It is strongly expressed in ovarian can- 
cer tissues and cell lines. The enzyme activity could not 
be established, because of difficulties in producing suffi- 
cient recombinant protein, a common problem with pro- 
teases. Transfectants were selected that overexpress the 
mRNA, but the protein levels remained very low. 

Conclusions: Protease M expression (mRNA) may be a
useful marker in the detection of primary mammary 
carcinomas, as well as primary ovarian cancers. Other 
medical applications are also likely, based on sequence 
relatedness to trypsin and PSA. 



INTRODUCTION 

Serine proteases are protein-cleaving enzymes
that contain a serine residue in their active sites
and that play important roles in diverse physiological
processes, including digestion (e.g.,
trypsin, chymotrypsin) and blood clotting (e.g.,
plasminogen activator, thrombin). Serine proteases
also act as regulators of a variety of processes
by proteolytic activation of precursor
proteins.

Address correspondence and reprint requests to: Ruth
Sager, Dana-Farber Cancer Institute, Division of Cancer Genetics,
44 Binney Street, Boston, MA 02115, U.S.A.
The nucleotide sequence(s) reported in this paper has been
submitted to the GenBank/EMBL Data Bank with accession
number U62801.

The kallikreins are a subfamily of serine 
proteases originally defined as those cleaving 
vasoactive peptides (kinins) from kininogen (1). 
Currently, the kallikreins comprise a large, multigene
family in rodents, although only three
members of this family, hKLK1, hKLK2, and
hKLK3, are known in humans. These three
genes encode the proteins pancreatic/renal kallikrein
(hK1), glandular kallikrein (hK2), and
prostate-specific antigen (PSA; hK3), respectively
(2).

The hK1 protein is secreted from pancreas,
kidney, and salivary glands (3), and is the only 
member of the family having true kallikrein ac- 
tivity. Its major function is the generation of 
kinins from kininogens and the regulation of 
blood pressure (1). The hK2 protein has yet to be 
detected in human tissue or fluids, but its sequence
has been inferred from a genomic clone
(4), as well as cDNA clones isolated from prostate 
libraries (5). hK2 expression is specific for pros- 
tate and is regulated by androgens (5). PSA is 
produced predominantly in males by prostate 
epithelial cells and is secreted into the seminal 
fluid, where it serves to degrade the gel-like 
semenogelin protein and increase sperm motility
(6,7). Although PSA is produced at higher levels 
in normal than in malignant prostate tissue, a 
defect in the malignant tissues ultimately results 
in the leakage of PSA into the bloodstream (8).
This is the basis of the use of PSA as a circulating
tumor marker for prostate cancer.

Here, we describe the isolation by differential 
display (9-11) of a novel member of the serine 
protease family which is most homologous to 
trypsin and members of the kallikrein family. 
This novel protein, which we have named pro- 
tease M, is down-regulated in metastatic breast 
cancer lines but strongly expressed at the mRNA
level in some primary breast cancer cell lines and 
in ovarian cancer tissues and tumor cell lines. 



MATERIALS AND METHODS 

Mammary Cell Strains and Lines 

Normal human mammary epithelial cell strains 
(70N and 76N) were derived from reduction 
mammoplasties in this laboratory as described 
(12). Primary (21PT, 21NT) and metastatic 
(21MT-1, 21MT-2) tumor lines were established 
in this lab from a single patient as described 
(12,13). Human mammary epithelial tumor cell
lines MCF-7, T47D, ZR-75-1, BT549, MDA-MB-157,
MDA-MB-231, MDA-MB-435, MDA-MB-436,
MDA-MB-361, and BT-474 were obtained
from American Type Culture Collection
(ATCC, Rockville, MD, U.S.A.). Cells were grown
in DFCI-1 medium (12) and harvested at approximately
70% confluence for RNA isolation.



Prostate Cell Lines 

Normal, immortalized prostate epithelial cell 
lines: CF3 (HPV immortalized), CF91 (SV40 im- 
mortalized), and MLC (SV40 immortalized) were 
provided by Dr. Johng Rhim and were cultured 
in KGM medium (DIFCO, Detroit, MI, U.S.A.). 
The tumor cell lines DU145, LNCaP, and PC3 
(ATCC) were grown in Dulbecco's modified Ea- 
gle's medium (DMEM) plus 10% fetal calf serum 
(FCS; Hyclone, Logan, UT, U.S.A.).

Ovarian Cell Cultures and Tissues 

The primary human ovarian surface epithelial 
cell cultures (HOSE 10/11, 16, and 21) were 
established from the ovarian surface epithelium 
as described (14). Immortalized ovarian surface 
epithelial cells (HOSE6.3E6E7) were obtained by 
infecting the HOSE cells with a replication-defec- 
tive retrovirus construct, LXSN16E6E7, as de- 
scribed (14). The eight ovarian carcinoma cell 
lines used for this comparative study include 
DOV13, OVCA420, OVCA429, OVCA432, and 
OVCA433, which were established in the labo- 
ratory of Gynecologic Oncology; CAOV3 and 
SKOV3, which were purchased from ATCC; and 
OVCA3, which was obtained from the National 
Cancer Institute (Frederick, MD, U.S.A.). 

Ovarian tumors obtained (15) include 6 borderline
ovarian tumors (354A, 373A, 395A,
405A, 466A, and 469A); 20 stage III/IV high-grade
invasive ovarian adenocarcinomas from
the primary ovarian site; 2 metastatic adenocarcinomas
from colon primary tumors (327A,
339A); and 3 normal ovaries (366N, 379N, and
465N).

Differential Display of mRNA 

Total cell RNAs (50 µg) from 21PT and 21MT-1
were treated with DNase I (Worthington DPRF,
Freehold, NJ, U.S.A.) in the presence of RNasin
ribonuclease inhibitor (Promega, Madison, WI,
U.S.A.) to remove residual DNA contamination,
as described elsewhere (11). Differential display
of the mRNA was performed as described (9,10).
Briefly, the RNAs were reverse transcribed using
the 3'-anchored primer T12MG (where M is a
mixture of A, G, or C). The resultant cDNAs were
then amplified by polymerase chain reaction (PCR)
in the presence of [35S]dATP using T12MG and the
arbitrary primer OPA1 (CAGGCCCTTC) and run
in adjacent lanes on a 6% sequencing gel. Differentially
displayed bands were recovered from
the dried gel, reamplified by PCR, 32P-labeled by
the oligo method (16), and used as probes on
Northern blots prepared with 21PT and 21MT-1
total RNA.
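
The anchored primer described above is a degenerate pool rather than a single oligonucleotide. The following minimal Python sketch spells out the sequences implied by "T12MG" when M is A, C, or G. The function name `anchored_primers` and its parameters are illustrative assumptions, not part of the published protocol.

```python
from itertools import product


def anchored_primers(t_len=12, m_bases="ACG", last_bases="ACGT"):
    """Return 3'-anchored oligo(dT) primers of the form T(n)-M-N.

    t_len      -- number of T residues (12 for the T12MN design)
    m_bases    -- bases allowed at the degenerate M position (no T)
    last_bases -- bases allowed at the final, fixed 3' position
    """
    return ["T" * t_len + m + n for m, n in product(m_bases, last_bases)]


# T12MG: the final base is fixed to G, while M is a mixture of A, C, and G.
t12mg_pool = anchored_primers(last_bases="G")
for primer in t12mg_pool:
    print(primer)
```

Each member of the pool anneals only to the subset of poly(A)+ transcripts whose last two bases before the tail are complementary to the anchor, which is what limits the number of bands per lane.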

Cloning, Sequencing, and Analysis of 
cDNAs 

The reamplified band from differential display 
was cloned into the TA cloning vector pCRII
(Invitrogen, San Diego, CA, U.S.A.) and sequenced
on both strands using T7 and SP6 primers.
cDNA libraries from 21PT and 76N cells, constructed
in Lambda Zap II (Stratagene, San
Diego, CA, U.S.A.), were screened using the
cloned PCR product as a probe and several cDNA 
clones were isolated and sequenced on both 
strands. The longest cDNA clone (from the 76N 
library) was sequenced on both strands using an 
ABI automated sequencer (Model 373A) by the 
Dana-Farber Molecular Biology Core Facility. 
Oligonucleotides used for sequencing were syn- 
thesized by the Dana-Farber Molecular Biology 
Core facility or by Amitof, Inc. (Cambridge, MA, 
U.S.A.). The predicted protein coding region and 
non-translated regions were determined and for- 
matted using the GCG Publish program. The pre- 
dicted protein sequence was compared to protein 
databases using the Blast algorithm (17). Protein 
alignment with related proteins was performed 
on GCG using the Pileup, Distances, and Prettyplot
programs.

Northern Analysis 

Total cell RNA was isolated by the guanidinium 
isothiocyanate/cesium chloride method and an- 
alyzed on Northern blots as described (18). 36B4 
(19), a ribosomal protein whose message is con- 
stant under a variety of conditions, was used to 
normalize the blots. Densitometric analysis of 
autoradiographs was performed with an imaging 
densitometer (Biorad GS-700) using the Molecular
Analyst software. 

Mapping of the Protease M Gene 

A panel of 24 human-rodent somatic cell hybrids 
(Mapping panel 2 from NIGMS, Coriell Institute 
for Medical Research, Camden, NJ, U.S.A.) was 
used for mapping the protease M gene. DNAs 
from hybrid and parental cell lines were restricted
with EcoRI, electrophoretically separated
in 0.8% agarose gels, and transferred to nylon
filters. Blots were hybridized with a 32P-labeled
protease M cDNA probe (1G3-8) corresponding
to nucleotides 566 to 1526. For fine mapping of
the protease M gene, fluorescent in situ hybridization
(FISH) was performed on normal human
lymphocyte spreads as previously described (20).
A λ clone (λ1G3-1) containing the 5' portion of
the protease M gene (from position 1 to 1016)
was labeled with biotin by nick translation and
co-hybridized with an α-satellite probe specific
for chromosomes 1, 5, and 19 (D1Z7/D5Z2/D19Z3;
Oncor, Gaithersburg, MD, U.S.A.). Chromosomes
were counterstained with 4',6-diamidino-2-phenylindole
(DAPI). Slides were
examined in a Zeiss Axiophot epifluorescence
microscope using the appropriate filter combinations.
Fluorescence signals were digitized, enhanced,
and analyzed using the ProbeMaster
FISH image analysis system (Perceptive Scientific
Instruments, Houston, TX, U.S.A.).

Production of Polyclonal Antibody and 
Western Blotting 

The multiple antigen peptide (MAP) (21)
73-GKHNLRQRESSQEQS-87 (0.5 mg) was emulsified
with an equal volume of Freund's adjuvant
and injected into 3- to 9-month-old New Zealand 
white rabbits. Boosts were done 2 and 6 weeks 
later. The animals were bled and serum was col- 
lected and stored at -20°C. Peptide and antibody 
production was done at Research Genetics 
(Huntsville, AL, U.S.A.). 

Whole cell lysates were prepared by sonicating
10^7 cells/ml for twenty 30-sec pulses in a
Sonicator Ultrasonic Processor in mammalian lysis
buffer (4 mM NaHCO3, 100 mM NaF, 20 mM
KH2PO4, 2 mM sodium orthovanadate, 5 mM
EDTA, 5 mM disfluorophosphate, 2 mM PMSF, 2
µg/ml leupeptin, 2 µg/ml aprotinin, pH 7.2).
Lysates were clarified by spinning at 14,000 × g
for 30 min in a microfuge.

Fifty to 100 µg of cell lysate was denatured
by heating in SDS-PAGE sample buffer (50 mM
Tris-HCl, pH 6.8, 0.1 mM DTT, 2% SDS, 0.1%
bromphenol blue, 10% glycerol) at 90°C for 5
min and run on a 12% acrylamide/SDS minigel
(Biorad), electroblotted onto a PVDF membrane
(0.2 µm; Biorad), and reacted with immune serum
(1:1000). Anti-rabbit IgG horseradish peroxidase-linked
whole antibody (Amersham)
(1:2000) was used as secondary antibody, and
immunoreactive bands were detected with enhanced
chemiluminescence (ECL; Amersham,
Arlington Heights, IL, U.S.A.).

Expression of GST Fusion Protein 

The full-length cDNA clone was PCR amplified
using the sense 5' 26-mer oligonucleotide
5'-GGAATTCCGTTGGTGCATGGCGGACC-3' and
the antisense 3' oligonucleotide
5'-GTCGGAATTCAGGGTCACTTGGCCTG-3' for 30 cycles
of 95°C for 1 min, 60°C for 1 min, and 72°C for 1 min to yield a
0.7-kb product which contained the open reading
frame without the hydrophobic N-terminal
amino acids. The resultant PCR product encoding
Leu22 to Lys244 was digested with EcoRI and
ligated to alkaline phosphatase-treated, EcoRI-linearized
pGEX-2T vector (Pharmacia, Piscataway,
NJ, U.S.A.) to produce plasmids encoding a GST-protease
M fusion protein. Escherichia coli strains
XL-1 blue or DH5α transformed with this construct
were grown and induced with 0.2 mM
IPTG at 37°C for 1 hr to produce GST fusion
protein, which was solubilized from bacteria and
purified on glutathione agarose beads by standard
methods (22).

Expression of Baculovirus 
Recombinant Protein 

A full-length cDNA clone was cut with EcoNI
and BstXI to give a fragment (nucleotides 233 to
1019) which was incorporated into the baculovirus
transfer vector pVL1392 (Pharmingen, San
Diego, CA, U.S.A.). Generation and amplification 
of recombinant baculovirus was as described 
(23,24). For production of protease M, Spodoptera 
frugiperda (cell line SF9) was infected with am- 
plified recombinant virus to obtain nearly 100% 
infection as gauged by enlarged cells. Ninety-six 
hours postinfection, cells were harvested and ly- 
sed by sonication in mammalian lysis buffer, ad- 
justed to 500 mM NaCl and rocked for 1 hr at 
4°C. All subsequent purifications were done at 
4°C. 

The lysate was adjusted to 125 mM NaCl,
loaded onto p-aminobenzamidine agarose (Sigma
A7155, St. Louis, MO, U.S.A.), washed with
loading buffer, and eluted with 25 mM NaPO4,
0.02% NaN3, 500 mM NaCl, 10 mM benzamidine,
pH 6.0. The eluted fractions were loaded
onto concanavalin A agarose (Sigma C8402) by
rocking for 1 hr, washed with 25 mM NaPO4,
0.02% NaN3, 500 mM NaCl, pH 6.0, and eluted
in wash buffer containing 10% methyl-α-D-mannopyranoside
(Sigma M6882).
FIG. 1. Identification of protease M (1G3) on
DD gel and on Northern blot

(A) DD gel: 21PT and 21MT-1 RNA was reverse
transcribed with T12MG primer and PCR-amplified
with T12MG and OPA1 primers in the presence of
[35S]dATP, run on a 6% acrylamide/urea sequencing
gel, and exposed to X-ray film for 18 hr. The portion
of the gel surrounding the differentially displayed
0.28-kb band is shown. (B) Northern blot: 10 µg of
total cell RNA was Northern blotted and probed with
32P-labeled PCR-amplified 0.28-kb band from the
DD gel shown in Panel A.



Expression Vector Construct and 
Transfection 

A full-length cDNA clone was cut with EcoNI
and BstXI to give a fragment that spanned nucleotides
233 to 1019. This fragment was incorporated
into pCMVneo plasmids (25) and
checked for correct orientation of the insert.
MDA-MB435C cells (5 × 10^6) were electroporated
at 220 V with 10 µg of this construct in the
presence of 10 µg/ml DEAE dextran. Vector
alone was used as a negative control. Cells (10^6)
were plated in five P100 dishes in Alpha + 5%
FCS. After 14 days of selection in media containing
1 mg/ml G418, the transfected clones were
refed with media containing 0.5 mg/ml G418 for
an additional week. Clones were picked in cloning
cylinders, expanded, and maintained in Alpha
+ 5% FCS containing 0.5 mg/ml G418.



RESULTS 
Differential Display 

Total RNA from a primary breast cancer cell line
(21PT) was compared with that from a metastatic
breast cancer cell line from the same patient
(21MT-1) by differential display (DD). Approximately
100 bands appeared for each primer
pair tested, and on average two to three bands
were differentially expressed. One of the bands
  1 AGG^GGACAAAGCCCGATTGTTCCTGGGCCCTTTCCCCATCGCGCCTGG 67

 68 GGCAGGGGCGGGGGCCAGTGTGGTGACACACGCTCTAGCTGTCTCCCCGGCTGGCTG 134

135 TCCTGGGGACACJWSAGGTCGGCAGGCAGCACACAGAGGGXCCTACGGGCAGCTGTTCCTT 201

202 CTCAAGAATCCCCGGAGGCCCGGAGGCCTGCAGCAGGAGCGGCC 245



FIG. 2. Protease M cDNA 

The cDNA sequence and putative protein
coding sequence of the longest
clone from the 76N library are shown.
The postulated pre and pro N-terminal
amino acids are underlined. The predicted
cleavage sites of pre and pro
amino acids after Ala16 and Lys21, respectively,
are indicated by arrows. The
potential N-linked glycosylation site at
amino acids 134-136 and Asp191 at the
bottom of the binding cleft are boxed.
The residues of the catalytic triad
(His62, Asp106, and Ser197) are circled.
The actual polyadenylation signal at
nucleotide 1490 and an alternative
polyadenylation signal at nucleotide
1095 are underlined.
ATG AAG AAG CTG ATG GTG GTG CTG AGT CTG ATT GCT GCA GCC TGG GCA GAG
Met Lys Lys Leu Met Val Val Leu Ser Leu Ile Ala Ala Ala Trp Ala Glu

GAG CAG AAT AAG TTG GTG CAT GGC GGA CCC TGC GAC AAG ACA TCT CAC CCC
Glu Gln Asn Lys Leu Val His Gly Gly Pro Cys Asp Lys Thr Ser His Pro

TAC CAA GCT GCC CTC TAC ACC TCG GGC CAC TTG CTC TGT GGT GGG GTC CTT
Tyr Gln Ala Ala Leu Tyr Thr Ser Gly His Leu Leu Cys Gly Gly Val Leu

ATC CAT CCA CTG TGG GTC CTC ACA GCT GCC CAC TGC AAA AAA CCG AAT CTT
Ile His Pro Leu Trp Val Leu Thr Ala Ala His Cys Lys Lys Pro Asn Leu

CAG GTC TTC CTG GGG AAG CAT AAC CTT CGG CAA AGG GAG AGT TCC CAG GAG
Gln Val Phe Leu Gly Lys His Asn Leu Arg Gln Arg Glu Ser Ser Gln Glu

CAG AGT TCT GTT GTC CGG GCT GTG ATC CAC CCT GAC TAT GAT GCC GCC AGC
Gln Ser Ser Val Val Arg Ala Val Ile His Pro Asp Tyr Asp Ala Ala Ser

CAT GAC CAG GAC ATC ATG CTG TTG CGC CTG GCA CGC CCA GCC AAA CTC TCT
His Asp Gln Asp Ile Met Leu Leu Arg Leu Ala Arg Pro Ala Lys Leu Ser

GAA CTC ATC CAG CCC CTT CCC CTG GAG AGG GAC TGC TCA GCC AAC ACC ACC
Glu Leu Ile Gln Pro Leu Pro Leu Glu Arg Asp Cys Ser Ala Asn Thr Thr

AGC TGC CAC ATC CTG GGC TGG GGC AAG ACA GCA GAT GGT GAT TTC CCT GAC
Ser Cys His Ile Leu Gly Trp Gly Lys Thr Ala Asp Gly Asp Phe Pro Asp

ACC ATC CAG TGT GCA TAC ATC CAC CTG GTG TCC CGT GAG GAG TGT GAG CAT
Thr Ile Gln Cys Ala Tyr Ile His Leu Val Ser Arg Glu Glu Cys Glu His

GCC TAC CCT GGC CAG ATC ACC CAG AAC ATG TTG TGT GCT GGG GAT GAG AAG
Ala Tyr Pro Gly Gln Ile Thr Gln Asn Met Leu Cys Ala Gly Asp Glu Lys

TAC GGG AAG GAT TCC TGC CAG GGT GAT TCT GGG GGT CCG CTG GTA TGT GGA
Tyr Gly Lys Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Leu Val Cys Gly

GAC CAC CTC CGA GGC CTT GTG TCA TGG GGT AAC ATC CCC TGT GGA TCA AAG
Asp His Leu Arg Gly Leu Val Ser Trp Gly Asn Ile Pro Cys Gly Ser Lys

GAG AAG CCA GGA GTC TAC ACC AAC GTC TGC AGA TAC ACG AAC TGG ATC CAA
Glu Lys Pro Gly Val Tyr Thr Asn Val Cys Arg Tyr Thr Asn Trp Ile Gln

AAA ACC ATT CAG GCC AAG 977
Lys Thr Ile Gln Ala Lys 244

TGACCCTGACATGTGACATCTACCTCCCGACCTACCACCCCACTGGCTGGTTCCAGAACGTCTCTCA 

CCTAGACCTTGCCTCCCCTCCTCTCCTGCCCAGCTCTGACCCTGATGCTTAATAAACGCAGCGACGT

GAGGGTCCTGATTCTCCCTGGTTTTACCCCAGCTCCATCCTTGCATCACTGGGGAGGACGTGATGAG 

TGAGGACTTGGGTCCTCGGTCTTACCCCCACCACTAAGAGAATACAGGAAAATCCCTTCT 

TCCTCTCCCCAACCCTTCCACACGTTTGATTTCTTCCTGCAGAGGCCCAGCCACGTGTCTGGAATCC

CAGCTCCGCTGCTTACTGTCGGTGTCCCCTTGGGATGTACCTTTCTTCACTGCAGATTTCTCACCTG 

TAAGATGAAGATAAGGATGATACAGTCTCCATCAGGCAGTGGCTGTTGGAAAGATTTAAGATTTCAC 

ACCTATGACATACATGGGATAGC^CCTGGGCCGCCATGCACTCAATAAAGAATGTATTTTAAAAAAA

AAAAAAAAAAAAA 1526 
that was overexpressed in the 21PT lane (with
primer pair OPA1/T12MG; 280 bp in Fig. 1A)
was excised from the gel and PCR amplified. The
resulting 280-bp PCR product was used to probe
a Northern blot (Fig. 1B). Two bands were detected:
a band of 1.7 kb, which was very abundant
in 21PT and barely detectable in 21MT-1,
and a band of approximately 1 kb, which was
equal in both cell lines. The mixture was purified,
and the differentially expressed clone of 1.7
kb was recovered.

Protease M: Sequence Identification 

The 0.28-kb insert was used to screen a cDNA
library constructed in λZapII from a normal human
mammary epithelial cell line (76N). The
longest clone isolated was nearly full-length. This
clone of 1526 nt contains 245 bp of 5'-untranslated
sequences, 732 bp of coding sequences
(coding for a postulated protein of 244 amino
acids), and 549 bp of 3'-untranslated sequences
(Fig. 2). The presumptive protein coding region
begins with an ATG codon, which lies in a good
Kozak consensus sequence (26), CGGCCATGA,
and ends with a TGA translation stop codon. The
amino terminal portion has 13 consecutive hydrophobic
residues (Leu4 to Ala16), which is characteristic
of a signal peptide, followed by Glu17-Glu-Gln-Asn-Lys21,
which resembles a propeptide
with a potential trypsin-susceptible
cleavage site after Lys21. A potential N-linked
glycosylation site is found at Asn134-Thr-Thr136.
The expected polyadenylation signal AATAAA
was found 11 bp upstream of the poly A tail at
1490 bp. Another polyadenylation signal
AATAAA was found at 1095 bp.
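
The segment lengths quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the helper `cdna_layout` is a hypothetical name, and it assumes, as the numbering in Fig. 2 suggests, that the 732-bp coding figure excludes the TGA stop codon, which is therefore counted with the 3'-untranslated region.

```python
def cdna_layout(five_utr, cds, three_utr):
    """Derive 1-based coordinates from the three segment lengths (in nt)."""
    cds_start = five_utr + 1          # first base of the ATG
    cds_end = five_utr + cds          # last base of the final sense codon
    return {
        "total_nt": five_utr + cds + three_utr,
        "cds_start": cds_start,
        "cds_end": cds_end,
        "codons": cds // 3,           # amino acids encoded, stop excluded
    }


layout = cdna_layout(five_utr=245, cds=732, three_utr=549)
print(layout)
```

Under these assumptions the three segments sum to the reported clone length of 1526 nt, the coding region spans nucleotides 246 to 977, and 732/3 gives the 244 residues of the postulated protein.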

The postulated protein sequence was com- 
pared with the four most closely related proteins 
using the Pileup and Distances programs, and the 
comparison was displayed by the Prettyplot pro- 
gram (Fig. 3). Glandular kallikrein 2 (4,5) has 
44% exact matches and 48% matches with con- 
servative changes. Trypsin I (27) has 43% exact 
FIG. 3. Alignment of protease M with closely related members of the serine protease family

The GCG Pileup and Prettyplot programs were used to align protease M with closely related human serine proteases.
These are (from top to bottom): glandular kallikrein-hK2 (accession no. SP|P06870|), PSA-hK3 (accession
no. SP|P07288|), pancreatic kallikrein-hK1 (accession no. SP|P2051l|), and trypsinogen 1 (accession no.
SP|P07477|). Amino acids comprising the catalytic triad are marked with an asterisk. The 29 "invariant" amino acids
(Dayhoff) are marked with a dot or an asterisk.



matches and 49% matches with conservative
changes. Both glandular kallikrein 1 (3,28-31)
and prostate-specific antigen (32-36) contain
39% exact matches and 44% matches with conservative
changes. The catalytic triad of serine
proteases is conserved in the new protease (i.e.,
His62, Asp106, and Ser197). The
presence of aspartate at position 191 predicts that
this protein will produce trypsin-like cleavage,
unlike PSA, which has a serine at the corresponding
position and produces chymotrypsin-like
cleavage.
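
For readers unfamiliar with how an "exact match" percentage is obtained from a pairwise alignment, the following Python sketch shows one common convention: identical residues divided by the number of aligned, non-gap columns. This is an illustration only; the GCG Distances program may use a different denominator, and the short peptide strings are hypothetical fragments chosen for the example, not the alignment of Fig. 3.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments (one substitution in nine columns).
frag_a = "GDSGGPLVC"
frag_b = "GDSGGPLIC"
print(round(percent_identity(frag_a, frag_b), 1))
```

Counting conservative substitutions as matches, as in the second percentage quoted for each protein, would additionally require a residue similarity table, which is not specified in the text.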

Protease M contains 12 cysteine residues.
Ten of these are conserved in the two kallikreins,
PSA, and human trypsin, and would be expected
to form the following disulfide bridges: Cys28-Cys157,
Cys47-Cys63, Cys138-Cys203, Cys168-Cys182,
and Cys193-Cys218. The other two cysteines
(Cys131 and Cys231) are not found in the
kallikreins, PSA, and human trypsin, but are
found in similar positions in bovine trypsin and
would be expected to form a disulfide bond.

Twenty-seven of the 29 "invariant" amino
acids surrounding the active site of serine proteases
(37) are conserved in protease M. One of
the two nonconserved amino acids in protease
M, Ile155 in place of Leu, is a conservative
change. The other nonconserved amino acid,
His161 instead of Pro, is also found in glandular
kallikrein and PSA. The kallikreins and PSA have
11 amino acid residues, 109-119, which are not
found in protease M or trypsin. The function of
these amino acids is not clear, but they would be
expected to form the so-called kallikrein loop
which would determine substrate specificity
(38).



Chromosomal Localization 

In human DNA, the protease M probe detected a
major EcoRI fragment of about 18 kb, while in
mouse and hamster DNAs several smaller fragments
were detected. The human protease M
fragment in the hybrid clones segregated with
human chromosome 19 (data not shown). There
were no discordancies for localization to chromosome
19. To sublocalize the protease M locus,
two-color FISH was carried out, using the
genomic clone λ1G3-1 as a probe. A total of 22
cells were analyzed. Fluorescent signals on one
or both chromatids were found in the telomeric
region of 19q in 15 metaphase spreads (Fig. 4).
Twin-spot signals were not observed on any
FIG. 4. Chromosomal location of Protease M

The protease M locus was mapped to 19q13.3 by
FISH. A genomic protease M probe was co-hybridized
with an α-satellite probe specific for chromosomes
1, 5, and 19. Arrows point to protease M-specific
hybridization signals in the telomeric region of
the long arm of both chromosomes 19 at band
q13.3.



other chromosome. Comparison of the banding
pattern of chromosome 19 following DAPI staining
allowed us to assign the protease M locus to
19q13.3.




Expression of mRNA in Mammary and 
Prostate Cells 

Figure 5A shows the results of Northern blots of
mammary cell lines and strains. The two normal
cell strains shown (76N and 70N) and another
normal cell strain (81N) not shown expressed
the 1.7-kb protease M message at low levels. Two
primary tumor lines (21PT and 21NT) as well as
one metastatic line from the same patient (21MT-2)
expressed high levels of message (approximately
20- to 100-fold higher than the normal
strains). The most metastatic cell line from the
same patient (21MT-1), however, expressed low
levels of RNA (Fig. 1A). One other primary tumor
cell line (BT474) and nine other metastatic
cell lines (MCF-7, T47D, ZR-75-1, MDA-MB-157,
MDA-MB-231, MDA-MB-361, MDA-MB-435,
MDA-MB-436, and BT549) did not have
detectable message. Figure 5B shows Northern
blots of prostate cell lines. The normal, immortalized
cell strains CF3 and CF91 express moderate
levels of protease M mRNA, while another
normal immortalized strain, MLC, expresses just
trace amounts. In contrast, all three of the tumor
cell lines examined (DU145, LNCaP, and PC3)
failed to express any protease M message.






FIG. 5. Protease M mRNA expression in mammary and prostate cell lines 

(A) Ten micrograms of total mammary cell RNA was run on an agarose/formaldehyde gel, blotted, hybridized to
32P-labeled protease M probe, and exposed to X-ray film for 20 hr. (B) Ten micrograms of total prostate cell RNA
was blotted and hybridized (as in Panel A) and exposed to X-ray film for 20 hr.
FIG. 6. Protease M mRNA expression in ovarian
tissue

Ten micrograms of total cell RNA isolated from ovarian
tissue was blotted, hybridized to protease M
probe (as in Fig. 4), and exposed to X-ray film for 5
days.







Expression of mRNA in Ovarian Cell Lines 
and Tissue 

A series of normal immortalized and primary 
tumor derived ovarian cell lines were examined 
for expression of protease M mRNA on Northern 
blots. The message was not expressed in any of 
the five normal immortalized cell lines, but was 
detected in five of the eight primary tumor cell 
lines examined (not shown). We also examined
the RNA from a series of normal ovarian tissue 
and biopsies from primary tumors (one of the 
two Northern blots is shown in Fig. 6). While 
mRNA was not expressed in the three normal 
tissues examined, the six borderline ovarian tu- 
mor tissues, or the two metastatic tumors from 
colon primaries, it was expressed in 16 of the 
20 primary ovarian tumor tissue specimens 
examined. 



Expression of Protease M mRNA in 
Normal Human Tissue 

A Northern blot containing 2 µg of poly(A)+ RNA
from eight normal human tissues (Clontech, Palo 
Alto, CA, U.S.A.) was examined for expression of 
protease M (Fig. 7). While the message was not 
detected in heart, placenta, lung, liver, or skeletal 
muscle, high levels of message were detected in 
brain, kidney, and pancreas. The message de- 
tected in brain and kidney was 1.7 to 1.8 kb, but 
the message detected in pancreas was only about 
1.2 kb. A probable explanation for the smaller
message in pancreatic RNA would be the use of
the alternative polyadenylation signal at 1095 bp
noted in Fig. 2.
FIG. 7. Protease M mRNA expression in human
tissue

A Northern blot containing 2 µg of poly(A)+ RNA
from normal human tissue (Clontech) was hybridized
to protease M probes (as in Fig. 4). The blot was
exposed to X-ray film for 2 days.



Production of Polyclonal Antibody and Its 
Use to Study Expression of Protein in 
Mammary Cell Lines and Strains 

A polyclonal antibody was produced in rabbits
against a hydrophilic peptide which was not
highly conserved among other serine proteases
(73-GKHNLRQRESSQEQS-87). The Western blot
(Fig. 8) shows that the antibody detects a protein
of 37 kD in total cell lysates of the normal mammary
epithelial cell strain 81N and in the primary
tumor cell line 21NT. Protease M protein is not
detected in the metastatic breast cell line MDA-MB-435.
In other Western blots (not shown),
the antibody detected a 37-kD protein in the
normal strains 70N and 76N, as well as the primary
tumor cell line 21PT, but not in the metastatic
cell lines T47D and MCF-7. Up to 1 ml of
conditioned media from 70N and 21NT was examined
in Western blots probed with this antibody,
and no reacting proteins were detected
(not shown). This result suggests that the protein
is primarily localized intracellularly and not secreted.
The protein detected by the antibody is 37
FIG. 8. Expression of protease M protein in
mammary cell lines and insect cells infected
with recombinant protease M baculovirus

Fifty micrograms of total cell lysates from mammary
cell lines, uninfected insect cells (SF9), and insect
cells infected with either 4.5 µl of recombinant protease
M baculovirus (SF9/1G3 [1]) or 22.5 µl of recombinant
baculovirus (SF9/1G3 [2]) were run on a
12% polyacrylamide/SDS gel, transferred to PVDF
membrane, and reacted with protease M polyclonal
anti-peptide antibody as the primary antibody and
horseradish peroxidase-conjugated anti-rabbit IgG
secondary antibody. Bands were detected with ECL.



TABLE 1. Expression of protease M mRNA
and protein in mammary cells

Cell Line              RNA*    Protein†

70N                      5       100
81N                      4        60
16-1-1 (76N/HPV16)       4        64
21NT                    85        47
21PT                   100        76
MDA-MB-435               0         0
T47D                     0         0
MCF-7                    0         0

*RNA values were obtained by running 10 µg of total RNA
on a Northern blot, hybridizing it to 32P-labeled protease
M probes, and quantitating the resulting autoradiograms.
The most intense band was set equal to 100 and the
other values normalized accordingly.
†Protein values were obtained by running 50 µg of total
cell lysates on a Western blot and probing it with the protease
M antibody, as described in Materials and Methods.
The 37-kD bands on the autoradiograms were quantitated;
the most intense band was set equal to 100 and the other
values normalized accordingly.



kD, while the amino acid sequence predicts a 
protein of about 27 kD. The potential glycosylation
site (Asn134-Thr-Thr136) might explain
this size discrepancy.
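
The predicted size of about 27 kD can be approximated with a common rule of thumb of roughly 110 Da per amino acid residue. The sketch below is a back-of-the-envelope estimate under that assumption, not the calculation used by the authors; the constant and function names are illustrative.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average mass per residue (assumption)


def approx_mass_kd(n_residues, avg_residue_da=AVG_RESIDUE_MASS_DA):
    """Rough polypeptide mass in kD from residue count alone."""
    return n_residues * avg_residue_da / 1000.0


full_length_kd = approx_mass_kd(244)        # entire open reading frame
mature_kd = approx_mass_kd(244 - 21)        # after removal of residues 1-21
observed_kd = 37.0                          # band seen on Western blots
print(round(full_length_kd, 1), round(mature_kd, 1),
      round(observed_kd - full_length_kd, 1))
```

On this estimate the unmodified full-length polypeptide is close to 27 kD, so roughly 10 kD of the apparent mass on SDS gels is not accounted for by the amino acid sequence alone.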

Table 1 shows that the RNA levels for the 
serine protease are not always correlated with 
the protein levels. While the primary tumor cell 
lines (21NT and 21PT) have 20 to 100 times
more protease M mRNA than normal cell strains
(70N, 76N, and 81N), the protein detected on
Western blots is equal to or somewhat lower in 
the primary tumor cell lines than in the normal 
cell strains. 
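
As a rough cross-check of this comparison, the relative RNA and protein values can be taken directly from Table 1 and the tumor-to-normal ratios computed. The sketch below is illustrative only: grouping 16-1-1 (76N/HPV16) with the normal strains and the helper name `fold_range` are assumptions made for this example, and the inputs are normalized band intensities, not absolute amounts.

```python
# Relative band intensities from Table 1 (most intense band = 100).
# Assumption: 16-1-1 (76N/HPV16) is grouped with the normal strains here.
TABLE_1 = {
    "70N": {"rna": 5, "protein": 100},
    "81N": {"rna": 4, "protein": 60},
    "16-1-1 (76N/HPV16)": {"rna": 4, "protein": 64},
    "21NT": {"rna": 85, "protein": 47},
    "21PT": {"rna": 100, "protein": 76},
}
NORMAL = ["70N", "81N", "16-1-1 (76N/HPV16)"]
PRIMARY_TUMOR = ["21NT", "21PT"]


def fold_range(key):
    """Return (min, max) tumor/normal ratio over all tumor-normal pairs."""
    ratios = [
        TABLE_1[t][key] / TABLE_1[n][key]
        for t in PRIMARY_TUMOR
        for n in NORMAL
    ]
    return min(ratios), max(ratios)


rna_lo, rna_hi = fold_range("rna")
prot_lo, prot_hi = fold_range("protein")
print(f"RNA, tumor/normal: {rna_lo:.1f}- to {rna_hi:.1f}-fold")
print(f"Protein, tumor/normal: {prot_lo:.2f}- to {prot_hi:.2f}-fold")
```

On the tabulated values alone, the RNA ratios fall between about 17 and 25, whereas the protein ratios range from about 0.5 to 1.3.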

The anti-peptide polyclonal protease M antibody
has been used successfully in Western blots but
does not appear to be suitable for cellular
immunofluorescence studies: it gave a high
background with MDA-MB-435 cells, which do not
express the protease M message.

Production of Recombinant Protein 

Extensive efforts were made to produce recom- 
binant protein for further study of the protease. 
As discussed below, neither production in E. coli
as a GST fusion protein nor production in baculovirus
as a pure protein yielded more than minimal amounts
of the protease. As a result, the products that were
recovered were used primarily to verify the
specificity of the antibody preparations.

In a further effort to obtain recombinant pro- 
tein, transfectants were produced expressing 
protease M in the mammary tumor cell line 
MDA-MB-435. Transfectants were screened initially
for protein production and, as shown below, only
5 of the 76 transfectants produced any protein,
and then only at low levels.

Production of GST Fusion Protein 

The expected 52-kD GST/protease M fusion protein
was purified, yielding approximately 600 µg of
fusion protein per 500 ml of culture. When the
fusion protein was cleaved by incubation with
thrombin, the protease M fragment was degraded,
even at limiting dilutions, while only the GST
portion remained intact. At least 1 µg of fusion
protein was required to obtain a detectable
signal on Western blots.

Production of Baculovirus Recombinant 
Protein 

A Western blot containing 50 µg of lysates
prepared from SF9 cells infected with an amplified
stock of protease M recombinant baculovirus
was probed with anti-protease M antibody
(Fig. 8). Whereas reacting proteins were not
detected in the lysate from uninfected SF9 cells, a
protein of 39 kD was detected in lysates of SF9
infected with recombinant baculovirus. SF9/1G3
(1) had approximately 50% infected, enlarged
cells, and SF9/1G3 (2), which was infected with
five times as much virus, had nearly 100% infected
cells. The amount of recombinant protein
was, however, quite low, and we were not able
to detect a band of 39 kD on Coomassie blue-stained
gels (not shown).

We attempted to purify recombinant pro- 
tease M from lysates. By using p-aminobenzami- 
dine agarose affinity chromatography followed 
by concanavalin A agarose, recombinant pro- 
tease M was purified approximately 80-fold. The 
protein was, however, still only 10% pure, as 
determined from silver-stained gels, and the 
yield was less than 1 µg/10^8 cells. Using these data,
we calculated that 50 µg of lysate contains 15 ng
of protease M or 0.03% of the total protein.
Furthermore, by comparing the amount of the
39-kD band on silver-stained gels of the 80-fold
purified protease M with Western blots of the 
purified protein, we determined that the anti- 
body can detect 5 ng of protease M protein as a 
lower limit. 
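
The abundance estimate in this paragraph follows from a simple unit conversion, sketched here for reference. The function name `percent_of_total` is hypothetical, and the final calculation is an extrapolation that the text does not state: it asks what fraction of a 50 µg gel load the 5 ng detection limit would correspond to.

```python
def percent_of_total(target_ng, total_ug):
    """Express a protein amount (ng) as a percentage of total protein (µg)."""
    total_ng = total_ug * 1000.0  # 1 µg = 1000 ng
    return 100.0 * target_ng / total_ng


# 15 ng of protease M in 50 µg of infected SF9 lysate (values from the text)
abundance = percent_of_total(15, 50)
print(f"{abundance:.2f}% of total protein")

# Assumption for illustration: the 5 ng Western blot detection limit
# applied to the same 50 µg load
limit = percent_of_total(5, 50)
print(f"detection limit = {limit:.2f}% of a 50 µg load")
```

Under these assumptions, a protein would need to make up roughly 0.01% of a 50 µg lysate to be visible with this antibody.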

MDA-MB-435 Transfectants

A pCMV/neo/protease M construct and a neo-vector
control were transfected into MDA-MB-435 cells
(5 × 10^6 cells for each construct) by
electroporation. Eighty colonies of
protease-transfected clones and 20 colonies of
vector-transfected clones were transferred to 24-well
dishes. The protease-transfected cells grew more 
slowly and had more enlarged, dying cells than 
the vector controls. Total cell lysates were pre- 
pared from the 76 protease transfectants when 
the cells were approximately 70% confluent. 
Western blots, prepared from 50 µg of the lysate
from the 76 transfectants as well as 50 µg of
lysate from 70N (positive control), were probed 
with the protease M antibody. Only 2 of 42 fast- 
growing clones and 3 of 34 slow-growing clones 
expressed any detectable protein (data not 
shown). Furthermore, the level of protein ex- 
pressed by these positive clones was, in all cases, 
considerably less than in 70N cells. 
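
The screening counts reported in this paragraph can be tallied as follows. This is only a bookkeeping sketch; the dictionary layout is an assumption made for illustration.

```python
# Western blot screening of protease M transfectants (counts from the text):
# growth class -> (clones with detectable protein, clones screened)
screen = {"fast-growing": (2, 42), "slow-growing": (3, 34)}

positive = sum(pos for pos, _ in screen.values())
total = sum(n for _, n in screen.values())

for label, (pos, n) in screen.items():
    print(f"{label}: {pos}/{n} = {100 * pos / n:.1f}%")
print(f"all clones: {positive}/{total} = {100 * positive / total:.1f}%")
```

The totals agree with the 5 of 76 protein-positive transfectants quoted earlier in this section.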

Table 2 shows that protease M RNA was 
found in clones expressing protein as well as the 
majority of those not expressing protein. Thus, in 



TABLE 2. Analysis of protease M RNA and
protein expression in MDA-MB-435
transfectants

                            RNA^a   Protein^b

70N                          12       100
MDA-MB-435                    0         0
Protease M transfectant
  #13                         4         0
  #19                        10         0
  #42                        96        25
  #44                        61        12
  #53                         0         0
  #58                       100         0
  #59                         6         0
  #64                        22         0
  #65                        44        25
  #66                        55        63
  #75                        22         0
  #86                         0         0

^a,b These values were determined as in the footnotes to
Table 1.



MDA-MB-435 cells either the message is not 
translated efficiently or the translated protein is 
extremely unstable. 
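
The relationship between message and protein in the transfectants can be summarized by classifying the clones listed in Table 2. The sketch below is illustrative; treating any nonzero value as "positive" is an assumption of this example, not a threshold stated in the text.

```python
# (RNA, protein) relative values for protease M transfectants, from Table 2
TABLE_2 = {
    13: (4, 0), 19: (10, 0), 42: (96, 25), 44: (61, 12),
    53: (0, 0), 58: (100, 0), 59: (6, 0), 64: (22, 0),
    65: (44, 25), 66: (55, 63), 75: (22, 0), 86: (0, 0),
}

# Assumption: any value above zero counts as detectable expression.
rna_positive = {c for c, (rna, _) in TABLE_2.items() if rna > 0}
protein_positive = {c for c, (_, prot) in TABLE_2.items() if prot > 0}
rna_only = sorted(rna_positive - protein_positive)

print(f"RNA-positive clones: {len(rna_positive)} of {len(TABLE_2)}")
print(f"protein-positive clones: {sorted(protein_positive)}")
print(f"RNA without detectable protein: {rna_only}")
```

Of the eight listed clones without detectable protein, six still carry the message, including clone #58 with the highest RNA value in the table.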



DISCUSSION 

In a search for novel genes involved in metasta- 
sis, we have isolated by differential display 
mRNAs whose expression differs in a primary 
breast tumor cell line and a metastatic cell line 
derived from the same patient. Here, we describe 
the isolation of an mRNA that encodes a novel 
member of the serine protease family closely re- 
lated in sequence to trypsin and to PSA. This 
novel gene, protease M, is strongly expressed in 
primary breast cancer cell lines and in primary 
ovarian cancers but is down-regulated in cell 
lines derived from breast tumor metastases. This 
expression pattern suggests that protease M may 
be important in establishing breast and ovarian 
primary tumors, and may function later in pro- 
gression as a potential metastasis inhibitor. 

The projected sequence of the encoded protein
had about 40% amino acid identity to trypsin
and members of the kallikrein family (glandular
kallikrein, pancreatic kallikrein, and
prostate-specific antigen). Structural features im- 
portant for serine protease activity, such as the 
catalytic triad, the residues lining the binding 
cleft, and the cysteine bridges, were almost per- 
fectly conserved. Unlike the members of the kal- 
likrein family, protease M and trypsin lack the 
kallikrein loop at amino acid residues 109-119, 
which is important for kallikrein specificity. The 
size of the detected protein is 36 kD rather than 
the predicted size of 27 kD. This size discrepancy 
could be accounted for by glycosylation at Asn134
as in PSA (9). 

The protease M gene was mapped by somatic
cell hybrid and FISH analyses to chromosome
19q13.3. Several other members of the serine
protease family, including pancreatic/renal
kallikrein (KLK1), glandular kallikrein (KLK2), and
PSA, map to 19q13.3 (39). Physical mapping of
this region has revealed that all three kallikrein
genes are clustered within a 60-kb region and
that the order is KLK1-PSA-KLK2 (40). Our
mapping data suggest that protease M may also
be part of this gene cluster and that all four genes
may have originated from a single ancestral
precursor gene. Trypsin 1 (TRY1) and other more
distantly related serine proteases, such as
granzyme B (14q11.2), cathepsin G (14q11.2),
coagulation factor VII (13q34), and protein C
(2q13-q21), however, are all on other chromosomes,
indicating that during evolution this gene cluster
has been split up among several human chromosomes.
Interestingly, chromosome band 19q13 is
nonrandomly rearranged in a variety of human
solid tumors, including pancreatic carcinomas,
astrocytomas (grades III and IV),
hemangiopericytomas, ovarian cancers, and thyroid
tumors (41). Whether any of these rearrangements
affect protease M remains to be determined.

Protease M mRNA was expressed in normal 
breast and prostate cell strains but not in cell 
lines derived from metastatic tumors. The basis 
for this down-regulation is not known; however, 
since Southern blots show that the gene is nei- 
ther lost nor grossly rearranged in breast tumor 
cell lines (data not shown), it may be transcrip- 
tional. While protease M mRNA was expressed 
in normal human brain, kidney, and pancreatic 
tissue, it was absent in heart, placenta, lung, 
liver, and skeletal muscle. This expression pat- 
tern resembles that of the kallikrein HKLK1 (3). 
In contrast, PSA is expressed primarily in the 
prostate, though low levels have recently been
detected in breast tissue (42). 

We have produced a polyclonal anti-peptide
antibody to protease M and have detected the
protein in whole cell lysates, but not in
conditioned growth media. The question of whether
protease M is secreted, however, remains open.
Based on its sequence, the protein appears capa- 
ble of secretion. The amino-terminal sequences 
encode putative pre and pro regions, and the 
related pancreatic kallikrein and trypsin proteins 
are secreted. Since our anti-peptide antibody is 
rather weak (a lower limit for detectability of 5 
ng on Western blots), detection of low amounts 
of secreted protein in conditioned media and 
biological fluids will have to await the produc- 
tion of a high-affinity antibody to use, for exam- 
ple, in radioimmunoassays. 

The protease M protein was detected in ly- 
sates of normal breast epithelial cell lines, but not 
in breast metastatic cell lines, and correlated with 
the mRNA expression levels. In the primary 
breast tumor cell line 21PT, however, very high 
mRNA levels were observed, though the protein 
level was low. This lack of correlation of mRNA 
and protein levels was also observed in MDA-MB-435
cells that were transfected with a protease M
expression construct. Although many of the
transfectants expressed high levels of the mRNA,
the protein was absent or barely detectable. These
observations suggest that the expression of
protease M is regulated at both the transcriptional
and translational levels. Whether the low protein
levels are due to inefficient translation or rapid
degradation of the translated protein is not known.
Pulse-chase labeling experiments, followed by
immunoprecipitation and Western blot analysis of
protease M, might elucidate the mechanism of
translational down-regulation.

In breast, while the primary tumor cell lines 
(21NT and 21PT) expressed high levels of protease M
message, another primary tumor cell
line (BT474) did not express this message. When 
breast tissue samples were examined, two of the 
four primary tumor biopsies produced high lev- 
els of protease M message, whereas samples from 
normal adjacent tissue or reduction mammoplas- 
ties produced lower levels of the message (data 
not shown). Therefore, high protease M mRNA 
might serve as a marker for a subset of primary 
tumors. This subset of primary tumors could be 
distinguished from normal tissue by the higher 
levels of protease M mRNA. Protease M mRNA 
was expressed in the majority of primary ovarian 
tumor cell lines and tissues, but not in borderline 
ovarian tumor tissue, normal cell lines, or normal
tissues.

These observations suggest that protease M
might be useful as a diagnostic marker for pri- 
mary epithelial carcinomas. Furthermore, the 
close sequence relatedness to trypsin and the 
presence of a signal peptide in the N terminus 
raise the possibility that the protein may have 
valuable medical applications.